If ever there was such a thing as a canon event, the 2020 COVID-19 quarantine would be one. An event that globally impactful permeates every aspect of society and leaves ripples of influence: some good, some bad (some lovely to listen to on a walk or while doing chores).
People displayed intensely regressive reactions to being confined to house arrest for months. While some rode the wave of escapism back a couple of centuries and started frantically baking loaves of bread, others regressed within their own history and revisited cherished pieces of art from their childhood. This led to a renaissance of nostalgia made novelty—through the more measured gaze of adulthood, we could revisit art that was meaningful to us as children and appreciate certain complexities we were blind to before.
For a lot of us, that meant understanding some of the raunchier humor in Cartoon Network’s Justice League Unlimited (I’m not projecting), but for Erica Ito and Carter Nakamoto, it meant revisiting Rick Riordan’s Percy Jackson and the Olympians series and starting a podcast about the greatest love story ever told—Jason Grace and cranial trauma (not really).
Seaweed Brain: A Percabeth Podcast aired its first episode in June 2020, discussing each book in the original series and The Heroes of Olympus series over a number of episodes, following the books chronologically. Erica and Carter would discuss each episode's content either by themselves or with guests, some of whom became recurring presences on the show.
Since then, Erica and Carter have become an institution in the Percy Jackson community (they've even recorded episodes with the cast and crew of the Percy Jackson and the Olympians Disney Plus show) and the flag bearers for Percabeth supporters (anyone with any level of emotional intelligence). Along the way, they've gathered a diversity of opinions on a number of themes, topics, characters, and plot decisions throughout the story.
So we did what anyone would do…we methodically downloaded and transcribed each episode covering the first two series, brainstormed relevant topics, and conducted a sentiment and importance analysis on the episodes.
This report is a quick look into what stood out as the co-hosts and guests talked through the series: what characters were the most loved or hated, how people felt about different parts of the story, and other items that we thought would be fun to include.
We’ll be doing this using three metrics in particular: polarity, subjectivity, and overall sentiment. Derived by analyzing key phrases within the context of the surrounding text, these measures are commonplace in natural language processing and play a large role in discerning the connotation of text for AI software.
Polarity is an index that measures how positively or negatively something is talked about, ranging from -1 (hate it, makes me break out in hives, absolute criminal) to 1 (clears my skin, absolves my debts, is my happy place). It’s pretty rigidly designed, looking for adjectives and other descriptive phrases around the subject and comparing them all together.
There is ambiguity in this, however — polarity can’t discern whether the phrase in question is the subject or object of the sentence, so things like sympathetic negativity (“It’s really unfortunate the way Jason’s life ended”) can be overlooked. But not to worry! In the spirit of the wiser half of Percabeth, we’ve come prepared with another metric:
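As a rough illustration of the idea (a toy sketch, not the actual model used for this report), a lexicon-based polarity scorer averages the signed scores of the opinion words it recognizes. The `polarity` function and the word scores below are invented for the example:

```python
# Toy lexicon-based polarity scorer. The lexicon and its scores are
# invented for illustration; real tools use large, curated lexicons.
LEXICON = {
    "love": 0.8, "great": 0.7, "beloved": 0.9,
    "unfortunate": -0.5, "hate": -0.8, "criminal": -0.7,
}

def polarity(text: str) -> float:
    """Average the scores of known words; 0.0 if none are found."""
    words = text.lower().split()
    scores = [LEXICON[w] for w in words if w in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0
```

Note how a sketch like this reproduces the blind spot described above: "it's really unfortunate" scores negative even when the speaker is being sympathetic.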
Subjectivity is a measure of how expressive a statement about a subject is (subjective vs objective). This value operates on a scale from 0 (Luke is a criminal) to 1 (I think I can fix him). Subjectivity highlights personal opinions and punishes objective statements (uh, numerically speaking) so that what stands out are statements about a subject that directly correlate to the speaker’s feelings about it.
Subjectivity can’t tell in which direction the speaker is expressing their viewpoint (negative vs positive), which is why neither subjectivity nor polarity is all that reliable on its own.
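A subjectivity scorer can be sketched the same way as a lexicon-based polarity scorer, except the weights are unsigned: each known word carries a 0-to-1 rating of how opinion-laden it is. Again, the `subjectivity` function and its weights here are invented for illustration:

```python
# Toy subjectivity scorer: each known word carries a 0..1 weight for
# how opinionated it is. Weights are invented for illustration.
SUBJ_LEXICON = {
    "think": 0.9, "feel": 0.9, "amazing": 1.0, "love": 0.6,
    "is": 0.0, "was": 0.0,
}

def subjectivity(text: str) -> float:
    """Average the weights of known words; 0.0 if none are found."""
    words = text.lower().split()
    weights = [SUBJ_LEXICON[w] for w in words if w in SUBJ_LEXICON]
    return sum(weights) / len(weights) if weights else 0.0
```

Because the weights are unsigned, the score captures how opinionated a statement is without saying whether the opinion is good or bad.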
It’s quite fitting that while discussing a podcast about the most legendary ships in literature, we’re employing a beloved ship in the world of machine learning.
Polarity and subjectivity (subjarity?) complete each other: each has its own identity, but together they become a force greater than themselves. Polarity brings color to subjectivity’s life with the highs and lows of emotion, and subjectivity keeps polarity grounded with perspective and structure. Romantic.
Sentiment, in short, is their child: an aggregation of polarity and subjectivity. There are many different algorithms for computing it, depending on the problem you’re trying to solve.
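One simple aggregation (an assumption for illustration, not necessarily the exact formula used for this report) is to scale polarity by subjectivity, so that strongly opinionated statements count for more and dry, factual ones shrink toward zero:

```python
def sentiment(polarity: float, subjectivity: float) -> float:
    """One possible polarity/subjectivity aggregation (illustrative):
    weight the signed polarity by how opinionated the statement is."""
    return polarity * subjectivity
```

Under this scheme, an objective statement (subjectivity near 0) contributes almost no sentiment regardless of its polarity.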
Sentiment for this report was calculated in the following manner:
Audio files were downloaded from the Seaweed Brain podcast website and transcribed using a mix of Otter AI and a personally developed transcription program (to catch some of the more obscure names). Some links were broken or didn’t transcribe well, so those episodes were left out or trimmed, while ensuring each book still got as much representation as possible.
The episodes are denoted by the book’s initials and a number for their placement in that book’s episode sequence (lo_1 refers to the first episode talking about The Last Olympian, for example).
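This naming scheme splits cleanly into a book prefix and an episode number, which is exactly the extraction the plotting code performs later; a minimal sketch (the `parse_episode` helper is hypothetical):

```python
def parse_episode(code: str) -> tuple[str, int]:
    """Split an episode code like 'lo_1' into ('lo', 1)."""
    prefix, number = code.split("_")
    return prefix, int(number)
```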
Because the original series follows a structure of three episodes per book, while the Heroes of Olympus episodes covered smaller subsets of chapters, there are considerably more episodes per book in the latter than in the former. This was a cause for concern, but it was left as is because the Heroes of Olympus books tend to be longer anyway.
We might as well get right to it: just how much better is Percabeth than other ships?
Below we compare Percabeth with other ships mentioned or alluded to.
library(tidyverse)
library(reshape2)
library(ggplot2)
library(plotly)
ships <- read.csv('ships.csv')

# Book prefixes in reading order: the original series, then Heroes of Olympus
custom_prefix_order <- c("lt", "sm", "tc", "bl", "lo", "lh", "sn", "ma", "hh", "bo")

ships <- ships %>%
  mutate(
    prefix = sub("_.*", "", title),             # Extract the book prefix (e.g. "lo")
    number = as.numeric(sub(".*_", "", title))  # Extract the episode number
  )

ships$prefix <- factor(ships$prefix, levels = custom_prefix_order)
ships <- ships %>% arrange(prefix, number)
ships$title <- factor(ships$title, levels = ships$title)  # Lock in chronological x-axis order

ships_long <- melt(ships, id.vars = "title",
                   measure.vars = c("Percabeth", "Gruniper", "Hank", "Jasper", "Hellie", "Tella"),
                   variable.name = "ship",
                   value.name = "sentiment")

p <- ggplot(ships_long, aes(x = title, y = sentiment, color = ship, group = ship)) +
  geom_line() +
  geom_point() +
  labs(title = "Measuring Sentiment",
       x = "Episode",
       y = "Sentiment Value",
       color = "Ship") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1),  # Rotate x-axis labels
        plot.title = element_text(hjust = 0.5))             # Center the title

ggplotly(p) %>%
  layout(width = 900, height = 600)
NOTE: Click on any item in the legend to toggle it on or off
In possibly the most obvious conclusion ever, Percabeth sits in a totally different area of the plot and is by far the most beloved ship on the podcast (obviously), reaching highs during The House of Hades, The Last Olympian, and (surprisingly) The Lost Hero. Coming in second, somewhat unexpectedly, is Frank and Hazel, but second isn’t really saying much when Hedge and Mellie are on your heels.
Overall the most hated ship seems to be Jason and Piper, hitting a hard low during the first Blood of Olympus episode. This seems like a common theme throughout the episodes. An analysis wasn’t done for a fantastical Piper x Annabeth ship but it probably would have scored better.
Let’s look at the characters individually. The most talked-about main characters (from the original series and the Heroes of Olympus series, separately), supporting characters, villains, and Olympians were selected and charted:
# Load episode-level sentiment data for the main characters
mc_sent <- read_csv('mc_sentiment.csv')

og_mains <- mc_sent %>%
  select(title, Percy_sent, Annabeth_sent, Thalia_sent, Luke_sent,
         Grover_sent, Nico_sent, Rachel_sent, Chiron_sent)

custom_prefix_order <- c("lt", "sm", "tc", "bl", "lo", "lh", "sn", "ma", "hh", "bo")

og_mains <- og_mains %>%
  mutate(
    prefix = sub("_.*", "", title),             # Extract the book prefix
    number = as.numeric(sub(".*_", "", title))  # Extract the episode number
  ) %>%
  filter(prefix %in% c("lt", "sm", "tc", "bl", "lo"))  # Original series only

og_mains$prefix <- factor(og_mains$prefix, levels = custom_prefix_order)
og_mains <- og_mains %>% arrange(prefix, number)
og_mains$title <- factor(og_mains$title, levels = og_mains$title)

og_mains_long <- melt(og_mains, id.vars = "title",
                      measure.vars = c("Percy_sent", "Annabeth_sent", "Thalia_sent",
                                       "Luke_sent", "Grover_sent", "Nico_sent",
                                       "Rachel_sent", "Chiron_sent"),
                      variable.name = "character",
                      value.name = "sentiment")

p <- ggplot(og_mains_long, aes(x = title, y = sentiment, color = character, group = character)) +
  geom_line() +
  geom_point() +
  labs(title = "Measuring Sentiment",
       x = "Episode",
       y = "Sentiment Value",
       color = "Character") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1),  # Rotate x-axis labels
        plot.title = element_text(hjust = 0.5))             # Center the title

# Convert to Plotly and adjust the layout
ggplotly(p) %>%
  layout(width = 900, height = 600)
mc_sent <- read_csv('mc_sentiment.csv')

# Keep the polarity columns for the original-series main characters
og_mains <- mc_sent %>%
  select(title, book, Percy_pol, Annabeth_pol, Thalia_pol, Luke_pol,
         Grover_pol, Nico_pol, Rachel_pol, Chiron_pol)

# Average each character's polarity across all episodes
avg_vals <- og_mains %>% summarize(
  Percy    = mean(Percy_pol, na.rm = TRUE),
  Annabeth = mean(Annabeth_pol, na.rm = TRUE),
  Grover   = mean(Grover_pol, na.rm = TRUE),
  Thalia   = mean(Thalia_pol, na.rm = TRUE),
  Luke     = mean(Luke_pol, na.rm = TRUE),
  Nico     = mean(Nico_pol, na.rm = TRUE),
  Rachel   = mean(Rachel_pol, na.rm = TRUE),
  Chiron   = mean(Chiron_pol, na.rm = TRUE)
)

# Convert to long format
avg_long <- avg_vals %>%
  pivot_longer(cols = everything(), names_to = "character", values_to = "average")

# Create the plot
g <- ggplot(avg_long, aes(x = character, y = average, fill = character)) +
  geom_col() +
  labs(title = "Average Polarity Comparison",
       x = "Character",
       y = "Average Polarity Value") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1),  # Rotate x-axis labels
        plot.title = element_text(hjust = 0.5),             # Center the title
        legend.title = element_text(size = 10))

# Convert to interactive plot
ggplotly(g)
mc_sent <- read_csv('mc_sentiment.csv')

# Keep the subjectivity columns for the original-series main characters
og_mains <- mc_sent %>%
  select(title, book, Percy_subj, Annabeth_subj, Thalia_subj, Luke_subj,
         Grover_subj, Nico_subj, Rachel_subj, Chiron_subj)

# Average each character's subjectivity across all episodes
avg_vals <- og_mains %>% summarize(
  Percy    = mean(Percy_subj, na.rm = TRUE),
  Annabeth = mean(Annabeth_subj, na.rm = TRUE),
  Grover   = mean(Grover_subj, na.rm = TRUE),
  Thalia   = mean(Thalia_subj, na.rm = TRUE),
  Luke     = mean(Luke_subj, na.rm = TRUE),
  Nico     = mean(Nico_subj, na.rm = TRUE),
  Rachel   = mean(Rachel_subj, na.rm = TRUE),
  Chiron   = mean(Chiron_subj, na.rm = TRUE)
)

# Convert to long format
avg_long <- avg_vals %>%
  pivot_longer(cols = everything(), names_to = "character", values_to = "average")

# Create the plot
g <- ggplot(avg_long, aes(x = character, y = average, fill = character)) +
  geom_col() +
  labs(title = "Average Subjectivity Comparison",
       x = "Character",
       y = "Average Subjectivity Value") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1),  # Rotate x-axis labels
        plot.title = element_text(hjust = 0.5),             # Center the title
        legend.title = element_text(size = 10))

# Convert to interactive plot
ggplotly(g)